41 new feature read csv datasets #76

Merged
BaptisteDlp merged 20 commits into master from 41-new-feature-read-csv-datasets
Jan 15, 2024

Conversation

BaptisteDlp
Collaborator

@BaptisteDlp BaptisteDlp commented Dec 5, 2023

Keypoints for this pull request:

  • Split the creation of the database in defStormsDataset() into two subroutines, chosen according to the extension of filename: getDataFromNcdfFile() and getDataFromCsvFile()
  • Minor bug fixes, plus the missing unit conversion option for the poci input
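
A minimal sketch of the extension-based dispatch described in the first key point, written in base R. The helper name `pickReader()` and the labels it returns are hypothetical and only illustrate the idea; the package's own routines may be organised differently.

```r
# Hypothetical helper (not part of StormR): decide which kind of reader
# to use from the extension of `filename`.
pickReader <- function(filename) {
  ext <- tolower(tools::file_ext(filename))
  switch(ext,
    nc = "netcdf",
    csv = "csv",
    stop("Unsupported file extension: '", ext, "'")
  )
}

pickReader("ibtracs.SP.list.v04r00.csv")
pickReader("some_dataset.nc")
```

Failing early on an unknown extension keeps the two readers simple, since each one can assume it only ever receives its own file type.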

Here is reproducible code for testing:

# Download a csv database file, for example ibtracs.SP.list.v04r00.csv, from
# https://www.ncei.noaa.gov/data/international-best-track-archive-for-climate-stewardship-ibtracs/v04r00/access/csv/
library(StormR)

fileName <- "path/to/ibtracs.SP.list.v04r00.csv" # Placeholder: replace with the path to your csv file

# Map the StormR field names (left) to the column names of the csv file (right)
fields <- c(
  sid = "SID", names = "NAME", seasons = "SEASON", isoTime = "ISO_TIME",
  lon = "USA_LON", lat = "USA_LAT", msw = "USA_WIND", basin = "BASIN",
  rmw = "USA_RMW", sshs = "USA_SSHS", pressure = "USA_PRES", poci = "USA_POCI"
)
seasons <- c(2015, 2020)

# Build the dataset from the csv file, then extract and plot storm PAM near Vanuatu
sdsFromCsv <- defStormsDataset(filename = fileName, fields = fields, seasons = seasons)
s <- defStormsList(sdsFromCsv, loi = "Vanuatu")
plotStorms(s, names = "PAM")
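
Because every entry of `fields` has to name a column that exists in the csv header, a quick pre-check can save a failed run. The sketch below uses base R only; `missingFields()` is a hypothetical helper, not a StormR function, and the inline csv text is made-up sample data.

```r
# Hypothetical helper (not part of StormR): return the entries of `fields`
# whose column name is absent from the csv header.
missingFields <- function(fields, csvHeader) {
  fields[!fields %in% csvHeader]
}

# Made-up two-line csv standing in for a real IBTrACS file
csvText <- "SID,NAME,SEASON,ISO_TIME\n2015066S08170,PAM,2015,2015-03-09 00:00:00"
header <- names(read.csv(text = csvText))

fields <- c(sid = "SID", names = "NAME", seasons = "SEASON", msw = "USA_WIND")
missingFields(fields, header) # lists the fields with no matching column
```

For a real file, `names(read.csv(fileName, nrows = 1))` would read the header without loading the whole dataset.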

@BaptisteDlp BaptisteDlp linked an issue Dec 5, 2023 that may be closed by this pull request

codecov bot commented Dec 5, 2023

Codecov Report

Attention: 27 lines in your changes are missing coverage. Please review.

Comparison is base (bbae6df) 72.38% compared to head (ac0fb83) 73.86%.

| Files | Patch % | Lines |
|---|---|---|
| R/defStormsDataset.R | 83.22% | 27 Missing ⚠️ |
Additional details and impacted files
@@            Coverage Diff             @@
##           master      #76      +/-   ##
==========================================
+ Coverage   72.38%   73.86%   +1.47%     
==========================================
  Files           8        8              
  Lines        1492     1584      +92     
==========================================
+ Hits         1080     1170      +90     
- Misses        412      414       +2     


R/defStormsDataset.R: six review threads (outdated, resolved)
Contributor

@thomasarsouze thomasarsouze left a comment

  • Please add tests.
  • Please add examples in the documentation of the function

@BaptisteDlp
Collaborator Author

How would you like me to add an example? We would need either an example csv file (like the one we have in NetCDF) or something else I am not aware of. Any suggestions?

The same goes for the tests: we need a csv file to test against.

@thomasarsouze
Contributor

It would be a good idea to provide the sample dataset that we already have in NetCDF in csv format as well. That would let us test the csv functions, and also check that we get the same results regardless of the format of the dataset.
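
The format-parity check could be sketched as follows in base R. This is an illustration under stated assumptions, not the package's test suite: `sameTracks()` is a hypothetical helper, the track records are made up, and a real test would compare the objects returned by defStormsDataset() for the NetCDF and csv samples.

```r
# Hypothetical helper: TRUE when two track tables hold the same columns and
# the same values (within a numeric tolerance), whatever their column order.
sameTracks <- function(a, b, tolerance = 1e-6) {
  cols <- sort(names(a))
  if (!identical(cols, sort(names(b)))) {
    return(FALSE)
  }
  isTRUE(all.equal(a[cols], b[cols],
    tolerance = tolerance, check.attributes = FALSE
  ))
}

# Made-up records standing in for data read from the NetCDF sample
ref <- data.frame(
  name = c("PAM", "PAM"),
  lon = c(168.9, 169.2),
  lat = c(-11.2, -12.4),
  msw = c(50, 65),
  stringsAsFactors = FALSE
)

# Round-trip the same records through a csv file and compare
tmp <- tempfile(fileext = ".csv")
write.csv(ref, tmp, row.names = FALSE)
fromCsv <- read.csv(tmp, stringsAsFactors = FALSE)

sameTracks(ref, fromCsv)
```

Comparing with a tolerance instead of `identical()` avoids false failures from the number formatting that a text format such as csv introduces.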

@BaptisteDlp
Collaborator Author

Can you add it to the code (internal-data.R?) like you did for the NetCDF file?

@thomasarsouze
Contributor

Done with 0f7a178 (in inst/extdata)

@BaptisteDlp
Collaborator Author

BaptisteDlp commented Dec 14, 2023

I added an example to the function, plus several new and previously missing tests.

R/defStormsDataset.R: two review threads (outdated, resolved)
@BaptisteDlp BaptisteDlp removed the request for review from thomaspibanez December 21, 2023 00:22
pressure = "usa_pres",
poci = "usa_poci"
)
SP_2015_2020_csv <- defStormsDataset(fields = fields, seasons = c(2015, 2020))
Contributor

Suggested change
- SP_2015_2020_csv <- defStormsDataset(fields = fields, seasons = c(2015, 2020))
+ SP_2015_2020_csv <- defStormsDataset(system.file("extdata", "test_dataset.csv", package = "StormR"), fields = fields, seasons = c(2015, 2020))

Comment on lines +163 to +184
Hereafter, an example where we access the data with `test_dataset.csv`.
Notice that the header of the csv and the names of columns in `fields` input must be equals
```{r chunk 4}
# Header of the csv
head(read.csv(system.file("extdata", "test_dataset.csv", package = "StormR")))
```


```{r chunk 5}
# Is already the default setting (in this particular case)
fields <- c(
names = "name",
seasons = "season",
isoTime = "iso_time",
lon = "usa_lon",
lat = "usa_lat",
msw = "usa_wind",
sshs = "usa_sshs",
rmw = "usa_rmw",
pressure = "usa_pres",
poci = "usa_poci"
)
Contributor

Suggested change
Hereafter, you can see the header of the sample dataset `test_dataset.csv` embedded in the package.
Notice that the first line of the file must contain the column names, and that these must correspond to the `fields` input.
```{r chunk 4}
# Header of the csv
head(read.csv(system.file("extdata", "test_dataset.csv", package = "StormR")))
```

You can then define the stormsDataset object as follows:

# Is already the default setting (in this particular case)
fields <- c(
  names = "name",
  seasons = "season",
  isoTime = "iso_time",
  lon = "usa_lon",
  lat = "usa_lat",
  msw = "usa_wind",
  sshs = "usa_sshs",
  rmw = "usa_rmw",
  pressure = "usa_pres",
  poci = "usa_poci"
)

@BaptisteDlp BaptisteDlp merged commit 95b9c16 into master Jan 15, 2024
9 checks passed
@thomasarsouze thomasarsouze deleted the 41-new-feature-read-csv-datasets branch July 24, 2024 13:31

Successfully merging this pull request may close these issues.

[New Feature]: Read .csv Datasets
2 participants